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In  the  relational  model  for  data  base  organization,  the 
basic  concepts  are  mathematical  relations  and  a  family  of 
operators  on  them,  augmente3  with  the  concept  of  functional 
dependency  [2],  [3],  [4],  [5]. 

The  syntactic  properties  of  these  concepts  have  been 
investigated  intensively  and  are  rather  well  understood,  in 
contrast  to  many  other  data  models.  On  the  other  hand,  the 
semantic  properties  of  the  relational  model  are  not  so  well 
understood. 

We  are  using  the  word  ‘semantics'  in  this  paper  in  an  a 
priori  sense,  i.e.,  notions  in  a  formal  system  will  be 
developed  and  used  in  a  way  that  corresponds  to  pre-existing 
intuitive  ideas. 

Two  formal  notions  are  central  in  the  relational  model: 
normalization,  and  functional  dependency.  Informal 
counterparts  for  these  notions  have  not  been  offered;  only 
the  effects  of  their  application  have  been  shown.  For 
example,  normalization  eliminates  certain  kinds  of 
redundancy,  and  unpleasant  insertion-deletion  anomalies  from 
a  collection  of  relations  [3],  [4]. 

In  section  2  we  show  in  some  detail  that  there  are  open 
questions  concerning  the  semantic  nature  of  the  concepts 
"relation",  "normalization",  and  "functional  dependency". 
In  order  to  answer  these  questions  a  basic  semantic  data 
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model  is  introduced  in  section  3.  ?^s  this  model  takes  into 
account  certain  basic  properties  of  the  real  world,  it 
provides  us  with  informal  counterparts  for  its  formal 
notions.  The  principal  idea  is  that  there  are  two  kinds  of 
relationships  among  primitive  object  types  that  are 
essentially  different.  Making  the  distinction  between 
characteristics  and  associations  allows  us  to  partition  a 
data  base  into  independent  object  types  which  are  natural 
insertion  and  deletion  units,  and  associations  among  them. 
Independent  object  types  are  described  by  characteristics 
which  are  meaningful  modification  units. 

The  model  is  applied  to  relational  theory  in  section  4. 
In  particular  the  meaning  of  normalization  is  discussed.  It 
is  shown  that  five  different  types  of  relations  can  be 
distinguished  among  those  in  third  normal  form. 
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2»  The  Heist ign^l  Data  fiodel 

The  relational  data  nodel  proposed  by  Codd  [2]  provides 
a  collection  of  relations  as  the  user  interface  to  a  data 
base.  The  user  is  thereby  protected  from  having  to  know 
about  the  physical  real  organization  of  the  data. 

^  relation,  R,  is  defined  as  a  subset  of  the  Cartesian 
product  over  n  domains,  D1xD2x...Dn.  The  domains,  Di, 
i=1,..,n  are  given  as  sets.  In  normalized  relations,  the 
elements  of  a  domain  are  not  permitted  to  be  relations  or 
sets  themselves. 

^  ^^ta  base  is  a  finite  collection  of  time-varying 
relations  defined  on  a  finite  collection  of  domains  {Di} . 
A.S  the  domains  in  a  relation  need  not  necessarily  be 
distinct,  (distinct)  attribute  names  must  be  given  to  them. 

An  attribute,  B,  of  a  relation,  R,  is  called 
1  liUct ionally  dependent  on  attribute  A.  of  R,  if  at  every 
instant  of  time,  each  value  of  A  has  no  more  than  one  value 
in  B  associated  with  it  under  R.  The  same  definition 
applies  to  a  collection  of  attributes.  We  write  R.A  ->  R.B 
if  R.B  is  functionally  dependent  on  R.A,  and  R.A  />  R.B  if 
it  is  not.  In  the  seguel  we  will  not  make  a  distinction 
between  intra-  and  interrelationa 1  functional  dependencies, 
as  we  will  see  that  there  is  no  difference  from  the  semantic 
point  of  view. 
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One  of  the  major  advantages  of  the  relational  model  is 
that  relations,  their  properties  and  possible  operations  on 
them  (e.g.  join,  projection)  are  mathematically  well 
defined.  This  means  that  when  a  part  of  the  world  is 
represented  by  a  collection  of  relations,  then  relational 
theory  provides  mathematical  tools  (e.g.,  relational 
algebra)  to  derive  different  representations  of  the  world 
from  the  given  one.  For  example,  a  join  can  be  performed  if 
two  attributes  of  different  relation  are  of  interest  at  the 
same  time. 

But  the  relational  theory  gives  no  indication  about  the 
way  in  which  the  world  is  to  be  represented  by  a  collection 
of  relations.  Moreover,  important  properties  of  the 
(derived)  relations  depend  on  what  relations  were  initially 
chosen  in  order  to  represent  the  world.  For  example, 
consider  Codd’s  example  [2]  where  the  difference  between  a 
collection  of  relations  R (SUPPLIEH,  PART), 
S(PART,  PROJECT)  and  T (SUPPLIER,  PART,  PROJECT)  is  shown. 
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the  relational  model.  (Though  it  might  very  well  be 
answered  by  the  intuition  somebody  has  acquired  from 
frequent  use  of  the  relational  model.) 

The  next  question  concerns  the  differences  between 
different  forms  of  a  collection  of  relations  which  contain 
the  same  information  and  where  these  differences  come  from. 
In  particular,  one  of  the  ojectives  of  normalization  is  "to 
free  the  collection  of  relations  from  undesirable  insertion, 
update  and  deletion  dependenc ie s. "[ 3 ]  This  objective  is 
clearly  of  a  semantic  nature  and  it  is  not  easy  to  see  why 
it  is  completely  attained  by  the  syntactic  process  of 
normalization . 

Normalization  is  governed  by  functional  dependencies 
which  either  may  be  explicitly  declared  or  be  derived  by 
algebraic  rules  from  the  declared  functional  dependencies. 
But  are  functional  dependencies  adequate  for  expressing 
knowledge  about  the  world?  (Some  methods  for  data  base 
design  that  are  based  only  on  functional  dependencies  might 
suggest  this.  [1],  [10]) 

Recall  that  the  definition  of  functional  dependency  is 
usually  only  applied  to  relations  in  first  normal  form. 
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Therefore,  if  a  set  of  values  in  a  domain  B  is  associated  to 
one  value  in  a  domain  A,  then  domain  B  is  not  considered  to 
be  functionally  dependent  on  domain  A,  As  a  consequence, 
the  knowledge  (say)  that  an  employee  has  one  manager  is 
represented  by  a  functional  dependency,  but  the  knowledge 
that  an  employee  has  children  is  not.  Why?  Is  there  a 
difference  between  these  kinds  of  knowledge? 

For  the  same  reason,  set  valued  facts  can  not  be  derived 
either.  Suppose  an  employee  works  in  several  departments, 
then  EMPL  />  DEPT,  and  suppose  that  each  department  has  only 


one  contract  typ 

s. 

DEPT 

-> 

CTYPE.  Clearly 
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knowledge  about 

employees 
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are  working.  But 

it 

cannot 

be 

expressed  as  a 
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dependency. 


Another  problem  is  related  to  the  transitivity  of 
functional  dependencies.  Suppose  that  an  employee  has  one 
manager,  and  the  manager  earns  one  salary,  and  an  employee 
has  one  job  description,  and  each  job  description  contains 
one  job  name.  We  have  then: 

EMPL#  ->  MAN#  MAN#  ->  SAL,  and 
EMPL#  ->  JOBDFSCP#  JOBDESCR#  ->  JOBNAHE. 


Now  we  can  derive  the  dependencies 

EMPL#  ->  SAL,  and  EMPL#  -  >  JOBNAME. 

The  meaning  of  these  derived  dependencies  is  quite 
different.  The  first  one  denotes  the  salary  of  the  manager 
of  the  employee,  whereas  the  second  one  denotes  the  jobname 
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of  the  employee.  This  semantic  difference  is  not  taken  into 
account  by  the  functional  dependencies  alone. 

These  observations  demonstrate  that  the  functional 
dependencies  are  not  completely  adequate  for  expressing  some 
knowledge  about  the  world.  Therefore,  a  data  model  that 
expresses  a  low  level  of  semantic  knowledge  is  discussed  in 
the  next  section. 
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3.  A  Basic  Semantic  Data  Model 


In  this  section,  we  start  our  discussion  with  a  data 
model  that  represents  the  world  by  object  types  and 
relationships  among  these  object  types.  This  model  shows 
inadequate  structure,  however.  For  when  mapping  from  the 
world  into  such  a  model  an  important  difference  which  exists 
in  the  real  world  is  lost,  namely,  the  difference  as  zo 
whether  one  object  is  used  only  in  order  to  describe  another 
object,  or  whether  there  is  a  relationship  among  two  objects 


each  of  which 

exists  on 

its 

own.  We 

will  make  this 

distinction  and 

call  the 

first 

kind  of 

relationship  a 

characteristic. 

the  second 

kind 

an  association.  A.s  a 

consequence,  the 

resulting 

model 

shows  a 

structure  that 

corresponds  to  our  'semantic*  view  of  the  world.  In  this 
paper  the  model  will  be  explained  informally.  A  detailed 
discussion  of  the  model,  its  relation  to  other  models,  and 
precise  mathematical  definitions  are  given  in  [9]. 
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3.1  A.ssumptions  and  Definitions 

Let  us  assume  that  the  real  world  can  be  represented  by 
a  set  of  objects,  {ob jl} ,  (also  called  instances  of  object 
types)  which  represent  the  things  existing  in  the  world,  and 
by  a  set  of  instances  of  relationships,  {irJ  (ob jl , . . . ob jN) } , 
which  represent  interrelations  or  interactions  among  the 
objects. 

Instances  of  object  types  and  instances  of  relationships 
can  be  classified  according  to  properties  which  they  have  in 
common.  These  classes  will  be  called  object  type  and 

respectively.  A  relationship  can  only  be 
formed  of  instances  whose  corresponding  objects  belong  to 
the  same  object  type.  Instances  of  object  types  and  of 
relationships  will  be  denoted  by  small  letters,  while  object 
types  and  relationships  will  be  denoted  by  capital  letters. 

For  example,  assume  that  a  car  ”a"  has  the  colour  ”maise 
yellow”,  another  car  ”b”  has  the  colour  "metallic  red",  and 
that  there  are  other  colours,  "green",  "red",  ....  An 
object  type  CAR  is  formed  by  {a,b),  an  object  type  COLOUR  is 
formed  by  {red,  green,..},  and  an  object  type  CAR_COLOUR  is 
formed  by  (maise  yellow,  metallic  red) .  A  relationship, 
CAR_H AS_C0L00R  will  only  be  defined  between  the  object 
types  CAR  and  CAR_COLOUR. 

Note  that  there  is  a  difference  between  a  relationship 
and  a  relation  in  the  sense  of  the  relational  model. 
P  ela+- ionships  represent  fundamental  facts  about  object  types 


in  our  model.  They  are  not  to  be  further  decomposed,  and 
are  supposed  to  have  a  direct  meaning.  For  example, 
relationships  are  that  an  employee  "has"  an  employee  number 
and  a  name,  that  he  "lives  at"  an  address,  that  he  "earns"  a 
salary,  and  that  he  "is  a  friend  of"  other  employees.  On 
the  other  hand,  a  relation 

EMPLOYEE (EMPLOYEE  NUMBER,  NAME,  ADDRESS,  SALARY,  FRIEND) 
is  a  priori  without  semantic  meaning.  For  example,  what 
meaning  is  there  in  the  fact  that  the  attributes  FRIEND  and 
SALARY  appear  together? 


According  to  our  view  of  the  world,  we  can  dist 
two  hinds  of  relationships.  Let  us  first  expla 
difference  between  them  by  an  example.  Suppose 
employee  has  an  address  and  an  eye-colour.  The  addre 
eye-colour  characterize  an  employee,  and  their  instan 
only  of  interest  as  long  as  the  employee  whom 
characterize  exists  (in  the  data  base)  .  Such  a  relat 
will  be  called  a  characteristic.  On  the  other  hand, 
that  an  employee  is  a  friend  of  another  employee 
employees  exist  independently  of  one  another  whether 
they  are  friends.  Such  a  relationship  will  be  ca 
association. 


ingu ish 
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The  following  definitions  provide 
the  distinction  between  characteristics 


precise  criteria  for 
and  associations. 


Def init ion_ 1 ;  An  object  has  exactly  one  occurrence  when  a 


modification,  insertion  or  deletion  of  the  object  in  one 
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instance  of  a  relationship  in  which  it  occurs  implies  its 
modification,  insertion  or  deletion  in  all  instances  of  all 
relationships  in  which  it  occurs.  Otherwise,  there  are 
different  occurrences  of  the  object.  If  there  exist 
different  occurrences  of  an  object  such  that  its 
modification,  insertion  or  deletion  in  an  instance  of  a 
relationship  implies  the  modification,  insertion  or  deletion 
in  all  instances  of  some  other  relationships,  then  it  is  the 
same  occurrence  of  this  object  which  occurs  in  these 
relationships. 

:  An  object,  obj2,  is  called  a  depending  part 

with  respect  to  an  instance  of  a  relationship,  ir,  if  its 
existence  implies  the  existence  of  ir  and,  consequently,  the 
existence  of  an  object,  objl,  such  that  ir  (ob j 1 , ob j2) .  The 
object,  objl,  is  called  a  ruling  £art,  and  ir  is  called  an 
instance  of  a  dependinq  relationship 

Def  initipn_3:  A,n  instance  of  a  relationship,  ir  (ob  j  1 , ob  j2)  , 
is  called  an  instance  of  a  characteristic  (relationship)  if; 

1.  ir  is  a  depending  relationship,  and  objl  is  the  ruling 
part  and  obj2  is  the  depending  part  of  ir. 

2.  The  same  occurrence  of  obj2  can  only  be  the  ruling  part 
in  other  characteristics,  but  not  occur  otherwise  in 
instances  of  relationships. 

3.  objl  and  obj2  are  not  the  same  occurrence  of  an  object. 
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As  a  consequence  of  definition  3,  if  obj2  is  also  a 
depending  part  in  other  instances  of  relationships  then 
there  are  different  occurrences  of  obj2  occurring  in  them. 

The  object,  obj2,  in  definition  3  is  called  a 
2k^£§.2l®ristic  ob;iect.  ^n  object  that  is  not  a  depending 
part  in  any  instance  of  a  characteristic  relationship 
(i.e, ,  not  a  characteristic  object)  is  called  an  independent 
ob;ject . 

A.11  instances  of  relationships  that  are  not  instances  of 
characteristics  are  called  instances  of  associations. 

Now  classes  of  characteristic  and  independent  objects, 
respectively,  can  be  formed.  A.  class  of  instances  of 
relationships  all  of  which  are  instances  of  characteristics 
(resp,  associations)  is  called  a  characteristic  (resp. 
association)  . 

In  some  cases,  a  relationship  may  be  regarded  as  either 
a  characteristic  or  an  association.  Suppose  that  a  PERSON 
POSSESSES  a  C?lR.  If  we  want  to  describe  only  the  property 
of  persons  in  a  data  base,  then  POSSESSES  can  be  considered 
as  a  characteristic.  But  if  we  want  to  describe  also  that  a 
CAR  is  REGISTERED  BY  a  VEHICLE  BUREAU,  then  POSSESSES  must 
be  considered  as  an  association.  That  means  that  the  same 
part  of  the  world  can  be  modelled  from  different  points  of 
view.  But  our  precise  definitions  will  help  the  data  base 
designer  to  know  exactly  what  effects  his  choice  has. 
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3 . 2  Stucturing  the, Data  Model 

^  ^§§§  consists  of  complex  independent  object 
types,  and  associations  among  them.  A.  complex  independent 

formed  by  a  kernel  that  is  an  independent 
object  type,  id  by  all  its  characteristics,  and  the 
characteristics  of  these  characteristics,  and  so  on.  A 
simple  independent  object  type  is  an  independent  object  type 
that  has  no  ch ? --- cter istics. 

Correspondingly,  complex  Ski^acterist ics  are 
characteristic  object  types  together  with  all  their 
characteristics,  and  simple  characteristics  are 
characteristic  object  types  that  have  no  characteristics 
themselves. 

A  complex  independent  object  type  and  a  complex 
characteristic  object  type  is  formed  by  all  independent 
objects  and  characteristic  objects,  respectively,  which  are 
from  the  same  object  type  and  have  characteristics  out  of 
the  same  characteristic  object  type.  We  can  distinguish 
between  non-repeating  characteristics,  where  a  given  object 
has  only  one  characteristic  object  out  of  a  characteristic 
object  type,  and  repeating  characteristics,  where  several 
characteristics  out  of  one  object  type  may  exist. 

A  graphical  representation  of  these  definitions  can  be 
given  as  follows.  Independent  object  types  are  represented 
by  a  characteristic  object  types  by  a  associations 
by  labelled  undirected  (thick)  edges. 


and  characteristic 
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relationships 

by 

edges 

directed  toward  the 

characteristic 

object  types. 

For 

the  sake  of 

simplicity,  we 

only  label  the 

a  ssociations; 

we 

do 

not 

label  the 

characteristic 

relationships. 

Instead, 

we 

name  the 

characteristic 

r elat ionships 

by 

using 

the 

name  of  the 

corresponding 

characteristic  object  type. 


a)  b) 

Figure  1, 

Figure  la  represents  a  complex  independent  object  type.  The 
kernel  is  characterized  by  characteristics,  some  of  which 
are  complex  and  some  of  which  are  simple.  All  the  leaves 
are  simple  characteristics.  Figure  1b  represents  a  complex 
characteristic  object  type. 


The  only  connections  among  complex  independent  object 
types  are  by  associations  among  the  kernels  of  these  object 
types.  Instances  of  associations  can  be  set  up  only  between 
those  objects  which  already  exist,  but  they  can  be  deleted 
at  any  time  provided  no  additional  rules  are  given  with  the 


association. 
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Complex  independent  objects  are  the  natural  units  of 
i nsertion  and  deletion.  On  the  one  hand,  they  represent  in 
our  model  the  things  that  we  regard  as  real  objects  in  the 
world.  On  the  other  hand,  it  is  proven  in  [9]  that  the 
insertion  or  deletion  of  a  complex  independent  object  does 
not  affect  other  ones.  But  complex  independent  objects  must 
only  be  deleted  when  they  are  in  no  instance  of  an 
association. 

A  characteristic  is  a  unit  that  can  be  independently 
modified,  e.g.  an  address,  or  a  house  number  of  an  address. 
But  a  characteristic  cannot  be  deleted  in  the  same  way  as  an 
independent  object  because  the  object  that  it  characterizes 
continues  to  exist.  Deleting  a  characteristic  means  that  it 
is  undefined  or  unknown,  which  is  different  from  the  fact 
that  it  does  not  exist. 

A.S  characteristics  describe  a  complex  independent 
object,  they  are  determined  by  the  object,  i.e.,  there  is  a 
semantic  dependency  among  a  complex  independent  object  and 
its  characteristics.  For  example,  if  a  book  is  given  then 
it  determines  the  key  words  that  describe  it,  and  a  given 
employee  determines  his  employee  number  and  his  address. 

In  a  gross  view  of  the  data  base,  the  characteristics  of 
complex  independent  object  types  can  be  disregarded. 
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Furthermore,  an  object  type  can  be  regarded  in  different 
levels  of  detail. 


As  a  consequence,  we  have  the  possibility  of  regarding  a 
data  base  in  different  levels  of  abstraction.  But  even  on 
the  coarsest  level  all  possible  connections  among 
independent  object  types  are  visible. 

Until  now  we  did  not  discuss  how  a  specific  independent 
object  is  identified  in  its  object  type.  One  generally  uses 
the  characteristics  of  an  object  in  order  to  identify  an 
object,  and  calls  a  (set  of)  characteristic  object  types 
that  identify  an  object  in  its  type,  a  k©!*  When  an  object 
cannot  be  uniquely  identified  in  its  type  by  its 
characteristics  then  we  will  identify  it  by  a  key  formed  by 
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concatenating  keys  of  other  objects  with  which  it  is 
related. 

A.  theory  based  on  our  semantic  model  will  classify 
different  kinds  of  associations,  such  as,  "part  of", 
"subconcept  of",  etc.,  and  different  kinds  of 
characteristics.  (See  [6]).  This  further  classification 
will  permit  us  to  derive  more  detailed  properties  of  object 
types,  associations,  and  characteristics.  For  example,  when 
an  object  type  is  a  subconcept  of  another  object  type,  then 
characteristics  of  this  latter  type  are  characteristics  of 
the  subconcept.  This  rule  does  not  hold  generally  for 
associations,  only  for  subconcept-associations. 

A  further  classification  of  the  concepts  of  our  basic 
model  will  derive  properties  that  do  not  replace  those  which 
have  been  discussed  in  this  section,  but  that  hold  in 
addition  to  them.  That  means  that  when  a  theory  is  based  on 
the  distinction  among  object  types,  characteristics,  and 
associations,  then  it  can  build  upon  the  fundamental 
properties  of  these  concepts. 
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4.  On  the  Semantics  of  Relational  Normalization 


In  this  section,  the  data  model  and  the  results 
developed  in  section  3  will  be  applied  to  Codd's  relational 
model.  All  examples  will  refer  to  figure  2.  For  the  sake 
of  simplicity,  we  will  use  "characteristic”  instead  of 
"characteristic  object  type"  when  no  misunderstanding  is 
possible. 


COURSE#  JOBDESCR# 

,  --y- 


TRAINING  TOPIC*  JOBDESCR  JOBNAME 


\  ■ 

DATE  SALARY  STREET  TOWN  HOUSE#  FIRST__NAME 

Repeating  characteristic  object  types  are  indicated  by  *. 


Basic  Semantic  Data  Model 
Figure  2.  Contrasting  data  models 
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EMPLOYEE  (EMPL#,  FIBST_NAME,  LAST^NAME,  STREET,  TOWN, 

HOUSE#,  DATE_OF_BIETH,  SAL,  JOBDESCR#, 

COURSE#) 

JOBDESCR  (JOBDESCR#,  JOBNAME) 

SALHISTORY  (EMPL#,  DATE,  SALARY) 

MAN  (MANlr  MAN_NAME) 

CUST  (CUST#,  CUST_NAME) 

SALE  (CUST#,  EMPL#) 

TRAINING  (COURSE#,  TOPIC) 

CHILDREN  (EMPL#,  CHILD  NAME) 

MANAGED  (EMPL#,  MAN#) 

Relational  model  in  third  normal  form 
Figure  2,  (Continued) 


4 . 1  Semantic  Dependencies .in^  the_Pe lational_Model 

In  the  following  we  will  analyze  how  semantic 
dependencies  are  represented  by  functional  dependencies  in 
the  relational  model.  Recall  that  domains  in  normalized 
relations  correspond  to  classes  of  simple  objects  in  the 
basic  semantic  model,  and  each  value  in  the  domain 
corresponds  to  a  simple  object. 

Let  us  consider  first  associations  among  independent 
object  types.  There  are  no  semantic  dependencies  and,  as  a 
conseguence,  in  general  no  functional  dependencies.  Neither 
is  it  the  case  that  all  objects  of  one  object  type  need  to 
participate  in  an  association,  e.g.  not  every  employee  must 
be  associated  with  a  customer.  That  means  that  a  relation 
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representing  an  association  is  not  non-loss  joinable  with 
other  relations  on  a  domain  formed  by  the  key  of  an 
independent  object  type. 

But  in  the  relational  model,  it  is  possible  that 
additional  time  independent  rules  resulting  in  a  functional 
dependency  or  in  non-loss  joinability  are  imposed  on  an 
association.  For  example,  a  rule  can  indicate  that  every 
employee  must  have  exactly  one  manager. 

Next  let  us  consider  characteristics  within  complex 
independent  object  types. 

A.n  instance  of  a  complex  independent  object  type,  e.g. 
EMPL  in  figure  2,  determines  necessarily  all  its 
characteristic  object  types,  independently  of  whether  they 
are  repeating  ones  or  not.  It  is  identified  by  one  or  more 
candidate  keys,  each  of  which  may  consist  of  instances  of 
either  one  simple  characteristic  object  type,  e.g.  EMPL#,  or 
a  set  of  them  e.g. ,  (NAME,  ADDRESS,  DATE  OF  BIRTH) .  As  a 
consequence,  the  keys  determine  the  characteristics  of  the 
object,  albeit  in  an  indirect  way. 

It  is  also  possible  for  a  complex  characteristic  object 
type  to  have  its  own  key  .  For  example,  JOBDESCR#  is  a  key 
for  a  JOBDESCR.  Such  a  key  will  be  called  a  characteristic 
key  in  the  sequel.  Note  that  the  characteristics  of  which 
the  complex  characteristic  is  composed  are  semantically 
determined  by  both  the  characteristic  key  and  the  candidate 
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keys  of  the  object  possessing  the  characteristics  ,  e.g. 
JOBNA.ME  by  JOBDESR#  and  EMPL#. 

Apart  from  these  two  cases,  another  case  must  be 
considered.  If  one  of  the  candidate  keys  is  composite,  then 
a  characteristic  may  determine  one  of  the  key’s 
constituents.  For  example,  imagine  a  post  office  file  where 
municipality  and  street  address  form  a  composite  key  and 
postal  code  is  a  characteristic.  If  postal  codes  are  only 
assigned  within  municipalities  (as  in  Canada) ,  then  the  code 
determines  the  municipality.  In  the  following  it  is 
supposed  that  no  semantic  dependencies  of  that  kind  exists. 

Let  us  now  consider  non-repeating  simple 

characteristics,  S.  They  are  obviously  functionally 
dependent  on  candidate  keys,  P-KEY,  as  for  every  key  there 
is  only  one  simple  characteristic  in  its  object  type.  We 
write 

P-KEY  ->  S. 

In  our  example,  SAL  and  DATE_OF_BIRTH  are  such 
character! sties. 

A  complex  characteristic  is  represented  in  a  normalized 
relation  by  splitting  it  up  into  simple  characteristics. 

If  a  complex  characteristic  has  no  characteristic  key 
then  it  must  be  split  up  into  characteristics,  some  of  which 
are  simple.  Si,  and  some  of  which  are  complex  with  keys,  C- 
KEYj.  In  our  example,  NAME  can  be  split  into  FIEST_NAME  and 
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LAST_NAME.  These  simple  characteristics  are  functionally 
dependent  on  the  primary  key(s) 

P-KEY  ->  Si. 

If  a  complex  characteristic  itself  has  a  characteristic 
key,  C-KEY,  e.g,,  JOBDESCP#  in  the  complex  characteristic 
JOBDESCP,  then  this  key  is  functionally  dependent  on  the 
candidate  kGy(s) 

P-KEY  ->  C-KEY 


The  same  considerations  as  were  applied  to  a  complex 
independent  object  type  can  now  be  applied  to  a  complex 
characteristic,  C,  with  a  characteristic  key,  C-KEY.  When 
the  characteristics  of  C  can  be  split  up  into  non-repeating 
simple  characteristics,  e.g.,  JOENAME  in  JOBDESCP,  or  have  a 
non-repeating  characteristic  with  a  key, 

C-KEYj,  then 
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C-KEY  ->  Sj,  C-KEY  ->  C-KEYj. 

Furthermore,  as  we  remarked  earlier, 

P-KEY  ->  Sj,  P-KEY  ->  C-KEYj. 

Clearly  this  process  can  be  repeated. 

Characteristics,  S,  Si,  and  C-KEY  are  called  directly 
determined  characteristics.  That  means  that  S,  Si,  and  C- 
KEY  are  not  determined  by  any  other  key  than  P-KEY. 

A.ccording  to  our  assumptions,  there  are  no  other  non¬ 
trivial'  functional  dependencies  within  complex  independent 
object  types. 

Repeating  characteristics  do  not  result  in  functional 
dependencies  as  for  every  key  there  may  be  more  than  one 
simple  characteristic  object  in  its  object  type.  Otherwise 
the  same  considerations  apply  as  in  the  case  of  non¬ 
repeating  characteristics.  If  we  denote  the  corresponding 
sets  by  a  *,  then  for  repeating  characteristics, 

P-KEY  />  S* 

P-KEY  y>  Si* 

P-KEY  y>  C*-KEY 
C-KEY  />  Sj* 

C-KEY  />  C*-KEYj. 


'  Trivial  functional  dependencies  are  those  which  arise 
from  projections,  e.g.,  (FIRST, LAST)  ->  LAST. 


-25 


Perhaps  other  kinds  of  semantic  dependencies  not  covered 
by  the  cases  discussed  above  could  be  constructed.  Such 
possibilities  will  be  disregrarded  because  either  they  seem 
to  be  simple  generalizations  of  the  cases  mentioned  above, 
or  they  do  not  seem  to  be  of  practical  importance.  There  is 
then  no  difference  between  the  two  third  normal  form 
definitions  (compare  [3],  [5]).  We  will  use  the  Boyce-Codd 
third  normal  form  in  a  proof  because  it  is  simpler  to  deal 
with. 


Let  us  restate  briefly  the  results  of  this  section.  By 
the  transition  to  functional  dependencies,  some  knowledge  of 
the  semantic  dependencies  among  objects  has  been  lost. 
Instead,  purely  syntactic  functional  dependencies  resulting 
from  rules  additionally  imposed  on  associations  have  been 


added . 
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4 . 2  Semantics  of  Third  Normal  Form 

Each  subgraph  in  the  graph  representation  of  our 
semantic  data  model  can  be  considered  as  a  relation  in  the 
relational  model.  (Some  of  these  subgraphs  may  not  be  very 
meaningful,  though.) 

When  a  complex  independent  object  type  is  represented 
entirely  by  one  relation  (which  may  contain  associations 
also)  then  the  transition  to  third  normal  form  will  perhaps 
split  up  the  relation.  On  the  other  hand,  when  a  complex 
independent  object  type  is  represented  by  several  relations 
then  each  such  relation  must  contain  at  least  one  of  the 
object’s  candidate  keys  or  one  of  the  characteristic  keys. 
By  normalization  to  the  third  normal  form  one  will  obtain 
(as  is  described  later  in  detail)  subrelations  each  of  which 
contains  the  candidate  key(s)  or  characteristic  key(s)  and 
which  describes  the  object  partially.  Those  subrelations 
that  describe  the  same  object  type  are  non-loss  joinable. 
By  the  transition  to  optimal  third  normal  form,  they  will  be 
collected  into  one  relation.  This  leads  to  Theorem  1. 

Theorem_1.2  Relations  that  are  in  optimal  third  normal 
form  can  be  subdivided  into  five  different  types: 

1.  Each  relation  represents  an  independent  object  type. 
The  relation  then  contains  all  candidate  keys,  {P-KEY} 
of  the  type,  all  non-repeating  simple  characteristics, 
{S}  and  {Si}  , 


which  are  directly  determined  by  {P- 
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KEY} ,  and  all  non-repeating  characteristic  keys,  {C- 
KEY} ,  directly  determined  by  {P-KEY} .  (See  remark  2 
below. ) 

2.  Each  relation  represents  a  repeating  characteristic 
object  type.  It  contains  a  candidate  key,  P-KEY,  and 
one  repeating  characteristic  object  type  denoted  by  a 
repeating  simple  characteristic,  S*,  or  repeating  simple 
characteristic  group.  Si*,  or  one  repeating 
characteristic  key,  C*-KEY.  (See  remark  3  below.) 

3.  Each  relation  represents  a  complex  characteristic  object 
type  with  a  key.  It  contains  one  or  more  eguivalent 
characteristic  keys,  {C-KEY} ,  and  all  non-repeating 
simple  characteristics,  {Sj} ,  and  characteristic  keys, 
{C-KEYj} . 

4.  Each  relation  represents  a  repeating  complex 
characteristic  object  type  with  a  key.  It  contains  a 
candidate  characteristic  key,  C-KEY,  and  one  repeating 
characteristic  object  type  denoted  by  a  simple 
characteristic,  S*,  or  repeating  simple  characteristic 
group,  Sj*,  or  one  repeating  characteristic  key,  C*-KEY. 

5.  Each  relation  represents  an  association.  When  the 

association  is  functional,  then  the  relation  has  a 
candidate  key,  P-KEY,  as  a  key.  (See  remark  2  below.) 

When  it  is  non-functional,  then  the  key  is  formed  as  a 
composite  of  all  the  candidate  keys  of  the  complex 
independent  object  types  that  are  in  the  association. 
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The  proof  will  be  given  in  the  Appendix. 


Remarks:  1.  The  relations  of  the  third  and  fourth  type 
correspond  in  their  structure  to  those  of  the  first  and 
second  type.  They  simply  describe  complex  characteristic 
object  types  as  opposed  to  complex  independent  object  types. 

2.  The  optimality  criterion  requires  that  the 
number  of  relations  be  as  small  as  possible  in  third  normal 
form.  According  to  it,  non-loss  joinable  functional 
associations  would  be  combined  on  a  P-KEY  with  descriptions 
of  a  complex  independent  object  type  having  the  same  P-KEY. 
That  is,  a  relation  of  type  1  would  be  joined  with  a 
corresponding  relation  of  type  5  resulting  in  a  relation 
which  would  contain  additionally  one  candidate  key  cf  each 
of  the  associated  object  types.  However,  we  believe  that  it 
is  important  to  distinguish  clearly  between  associations  and 
characteristics.  We  recommend  that  the  optimality 
criterion,  which  is  important  for  collecting  together  as 
many  characteristics  as  possible  into  a  relation,  not  be 
applied  to  non-loss  joinable  functional  associations. 

3..  When  the  optimality  criterion  mentioned  above  is 
applied  to  relations  of  type  2  or  4,  then  one  relation  would 
contain  all  repeating  characteristic  object  types  that 
characterize  one  complex  object  type.  As  a  consequence,  the 
Cartesian  product  of  all  characteristic  objects  in  the 
different  object  types  has  to  be  created.  Instead,  we 
recommend  using  one  relation  for  each  repeating 
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characteristic 

object  type.  The 

following 

exam  pie 

shows  the 

difference  in 

the  two  cases: 

EMPL# 

1  CHILD_NAME  | 

1  1 

DATE 

1 

1 

SAL 

1 

\  1 

1  Hans  I 

70 

1 

1 

10000 

1 

1  Dick  I 

70 

1 

10000 

1 

1  Hans  1 

72 

1 

12000 

1 

1  Dick  ( 

72 

1 

12000 

1 

1  Hans  1 

73 

1 

15000 

1 

1  Dick  1 

73 

1 

15000 

2 

1  Eva  1 

70 

20000 

EMPL#  I 

1 

CHILD_NAME 

EMPL# 

1 

1 

DATE 

1  SAL 

1 

1 

1  1 

Hans 

1 

1 

1 

70 

i  10000 

1  ! 

Dick 

1 

1 

72 

1  12000 

2  1 

Eva 

1 

1 

73 

1  15000 

2 

1 

70 

1  20000 

4.  This  classification 

of  relati 

ons  into 

different 

types  is  comprehensive  with  respect  to 

our 

basic  data  model. 

Each  concept  introduced  by  the  model  is  covered  by  a  type  of 
relation  and  vice  versa. 

But  there  are  concepts  which  are  not  discussed  in 
the  data  model,  e.g. ,  relationships  among  relationships. 
This  will  be  discussed  in  [9].  It  seems  that 
characteristics  '  of  associations  or  characteristics 
themselves  are  of  some  interest.  For  example,  consider  a 
relationship 

SUPPLY  (PAPT,  SUPPLIER,  QUANTITY) 
where  the  domain  QUANTITY  indicates  the  possible  quantity  in 
which  a  SUPPLIER  can  SUPPLY  a  PART.  The  object  types  PART 
and  SUPPLIER  are  in  the  association  SUPPLY  which  has  the 
characteristic  QUANTITY. 
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Though  these  additional  concepts  are  not  covered  by 
the  five  types  of  relations,  it  should  be  pointed  out  that 
characteristics  of  a  relationship  may  be  represented  within 
the  same  relation  as  that  which  represents  the 
relationship.  Thus,  characteristics  of  characteristics 
would  be  represented  within  relations  of  types  1-4,  while 
characteristics  of  an  association  would  be  represented 
within  a  relation  of  type  5. 


We  will  use  Figure  2 
Theorem  1 , 


to  illustrate  the  relation  types  of 


The  complex  independent  object  types  in  the  example  are 
EMPL,  CUST,  and  MAN 

while  all  the  remaining  object  types  are  characteristic 
object  types.  The  complex  independent  object  types  EMPL 
and  CUST  are  in  an  association  SALE  ,  and  EMPL  and  MAN 

are  in  a  functional  non-loss  joinable  association,  MANAGED. 
Both  of  these  associations  lead  to  relations  of  type  5. 


SAL,  DATE_OF_BIPTH,  and  EMPL#  are  simple  non-repeating 
characteristics  with  EMPL#  being  a  candidate  key.  These 
and  the  simple  characteristics  which  are  grouped  into  the 
(keyless)  complex  characteristic  types,  NAME  and  ADDRESS 
must  be  combined  into  a  type  1  relation  keyed  on  EMPL#. 
Furthermore,  the  keys  of  the  complex  characteristic  types 
JOBDESCR  and  TRAINING  will  be  included  in  the  type  1 

relation  keyed  on  EMPL#.  These  two  complex  characteristic 
object  types  will  form  relations  of  a  different  type. 
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EMPL#  and  the  repeating  simple  characteristic, 
CHILD_NAME,  form  a  relation  of  type  2.  EMPL#  and  the 
repeating  complex  characteristic,  SAL_HISTORY,  form  also  a 
relation  of  type  2  over  the  domains  EMPL#,  SAL,  and 
SAL_DATE. 

JOBDESCE  being  a  non-repeating  characteristic  type  with 
a  key,  JOBDESCP#,  and  a  simple  characteristic,  JOBNAME, 
forms  a  type  3  relation. 

Finally,  TRAINING,  being  a  characteristic  object  type 
with  characteristic  key,  COURSE#,  and  a  repeating  simple 
characteristic,  TOPIC,  forms  a  type  4  relation. 

Theorem  1  shows  from  where  the  important  benefits  of 

normalization  originate.  The  most  important  benefit  is  that 
normalization  separates  the  descriptions  of  complex 
independent  object  types  from  each  other,  and  from 
associations.  Conversion  to  second  normal  form  accomplishes 
this  separation  for  object  descriptions  that  are  combined 
with  non-functional  associations.  For  example,  the  relation 
(EMPL#,  DATE_OF_BIRTH,  SAL,  CUST#,  COST_NAME) 
is  split  up  into  the  three  relations: 

(EMPL#,  DATE_OF_BIRTH,  SAL) 

(COST#,  CUST_NAME) 

(EMPL#,  COST#) 

each  of  which  represents  either  a  complex  independent  object 
type  or  an  association. 
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However,  conversion  to  second  normal  form  does  not  split 
up  the  relation 

(EMPL#,  DATE_OF_BIRTH,  SAL,  MAN#,  MAN_NAME) 
where  EMPL#  ->  MAN#  and  every  employee  has  a  manager  (i.e., 
non-loss  joinable!).  Only  conversion  to  third  normal  form 
splits  up  functional  associations.  The  result  is  two 
relat ions : 

(EMPL#,  DATE_OF_BIRTH,  SAL,  MAN#) 

(MAN#,  MAN_NAME) . 

As  we  remarked  earlier,  our  opinion  is  that  the 
association  between  employee  and  manager  should  be  separated 
from  the  description  of  the  employee.  The  collection  of 
relations  then  becomes: 

(EMPL#,  DATE_OF_BIRTH,  SAL) 

(EMPL#,  MAN#) 

(MAN#,  MAN_NAME) . 

After  the  separation  that  is  caused  by  normalization, 
every  complex  independent  object  and  every  instance  of  an 
association  is  represented  only  once  in  the  collection  of 
relations.  This  is  the  reason  why  normalization  eliminates 
insertion  and  deletion  dependencies  resulting  from  multiple 
representations  of  a  complex  independent  object  or  an 
association  in  a  relation. 


Other  benefits  with  respect  to  characteristics  result 
from  normalization.  Conversion  to  second  normal  form 
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separates  non-repeating  from  repeating  characteristics, 
whereas  conversion  to  third  normal  form  separates 
characteristics  that  possess  (characteristic)  )ceys. 
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4 . 3  Insertion -Deletion  Pules  for  B^at  ions  in  0£timal  Third 
Normal  Form 


In  cases  when  a  complex  independent  object  type  is 
described  completely  by  non-repeating  simple  or  complex 
characteristics  without  characteristic  keys,  then 
normalization  delivers  the  complete  description  of  an  object 
type  by  a  relation  of  type  1.  Practically  all  examples 
given  in  the  literature  about  relational  normalization  are 
of  that  kind.  In  such  a  case,  a  tuple  in  that  relation  is  a 
meaningful  unit  for  insertion  or  deletion,  because  it 
represents  completely  a  complex  independent  object. 


Pules  that  concern  the  modification  of  one  tuple  in  a 
relation  have  been  discussed  by  Heath  [8],  But  even  in  the 
simple  case  when  a  complex  independent  object  is  completely 
represented  by  a  tuple  of  a  relation  in  third  normal  form 
two  additional  rules  must  be  observed.  They  concern  the 
(interrelational)  influences  of  tuples  in  different 
relations.  From  section  3  we  know  that  when  independent 
objects  are  deleted  then  they  must  be  in  no  associations. 
This  leads  to  rule  1. 


Pule  1;  A 
if  none  of 
tuple  of  a 


tuple  in  a  relation 
its  candidate  keys 
relation  of  type  5. 


of  type 
is  con 


1  can 
ained 


be  deleted  only 
in  any  other 


In  figure  2,  an  employee  can  only  be  deleted  if  he  is 
not  in  an  association  with  a  customer  in  the  relation  SALE, 
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or  a  manager  can  only  be  deleted  if  he  is  not  managing  an 
employee  (in  relation  MANAGED) , 

An  inverse  rule  is  valid  for  the  insertion  of 
associations  as  associations  can  only  be  made  between 
existing  independent  objects. 


Rule_2_^  A  tuple  may  be  inserted  into  a  relation  of  type  5 
only  if  all  foreign  keys  of  the  tuple  exist  already  as 
candidate  keys  in  relations  of  type  1, 


In  figure 
relation  SALE 
exist , 


2,  a  tuple  can  only  be  inserred  into  the 
f  if  the  corresponding  employee  and  customer 


When  a  complex 
characteristics  with 
its  description  is  d 
From  section  3  we  kno 
must  be  inserted  or 
one  time.  In  the 
characteristics  with 
additional  relation  o 
relation  JOB  DESCB. 


independent  object  type  has  complex 
keys  or  repeating  characteristics,  then 
istributed  over  more  than  one  relation, 
w  that  a  complex  independent  object 
deleted  with  all  its  characteristics  at 
case  that  it  has  non-repeating 
keys,  then  for  each  key  there  is  an 
f  type  3.  Compare  in  figure  2  the 
Then  the  following  rule  applies: 


Rule  3:  If  a  relation 
that  are  candidate  keys 
corresponding  tuples  in 


of  type  1  or  3  contains  foreign  keys 
in  a  relation  of  type  3,  then  the 
all  these  relations  must  be  inserted 


or  deleted  at  the  same  time 
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For  repeating  characteristics,  it  is  also  true  that  they 
are  to  be  inserted  or  deleted  at  one  time  with  the  complex 
independent  object  which  they  characterize.  But  now  several 
tuples  that  form  a  repeating  chatacteristic  are  affected  by 
an  insertion  or  deletion.  This  leads  to  the  following  rule. 

Buie  4;  If  a  candidate  key  of  a  relation  of  type  1  (resp.  3) 

is  contained  in  a  relation  of  type  2  (resp.  4)  then 

insertion  of  tuples  in  the  type  2  (resp.  4)  relation  must  be 

preceded  by  an  insertion  of  a  corresponding  tuple  in  the 
type  1  (resp.  3)  relation.  The  reverse  is  the  case  when  a 
deletion  is  performed. 

Note  that  modifications  of  a  repeating  characteristic, 
i.e.  insertion  or  deletion  of  only  some  of  the  tuples  that 
form  the  repeating  characteristic,  are  not  taken  into 
account  by  rule  4. 

Based  on  the  semantic  data  model,  an  insertion-deletion 
theory  for  a  multirelational  data  base  has  been  developed. 
Modifications  are  to  be  regarded  in  the  same  framework.  A 
modification  of  an  association  must  be  considered  as  a 
deletion  followed  by  an  insertion.  In  the  case  of 

characteristics,  the  modification  of  a  foreign  key  in  a 
relation  of  type  1  or  3  has  implications  for  tuples  with  the 
same  key  in  other  relations  of  type  3,  and  possible  further 
implications  for  tuples  in  relations  of  type  4.  The 
underlying  semantic  principle  is  that  the  modification  of  a 
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characteristic  has  influences  on  sub-characteristics,  and 
vice  versa.  This  will  not  be  discussed  in  more  detail  here. 
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5.  Consequence  for  Relational  Data  Base  Description 

A  relational  data  base  is  described  on  the  schema  level 
by  a  data  model  definition.  It  defines  for  each  relation: 
its  name,  all  attribute  names  and  the  primary  key,  and 
furthermore,  the  underlying  domains[7].  Additionally, 
information  about  time  independent  constraints  such  as 
functional  dependencies,  set  inclusion  declarations  and 
dynamic  integrity  constraints  is  given. 

As  a  consequence,  the  description  of  a  data  base  by  a 
relational  schema  is  very  flat.  That  means  that  the 
structure  inherent  to  a  certain  data  base  is  contained  only 
implicitly  in  the  schema;  it  cannot  be  recognized  easily. 

There  are  two  types  of  structure  which  are  hidden  in  a 
relational  data  model  definition: 

1.  Intrarelational:  Existing  natural  hierarchical  structure 

of  characteristics  (e.g.  NAME,  ADDRESS)  is  lost  because 
complex  characteristics  must  be  split  up  into 

attributes.  It  is  hard  to  understand  what  the 
collection  of  attributes 

STREET,  LAST_NAME,  HOUSE#,  FIRST_NAME,  TOWN 
means. 

2.  Interrelational:  As  there  is  only  one  type  of  relational 
structure  one  cannot  distinguish  whether  a  relation 
describes  a  complex  independent  object  type  (as  do 
relations  of  type  1  and  2) ,  a  complex  characteristic 
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object  type  (as  do  relations  of  type  3  and  4) ,  or  an 
association  between  independent  objects  (as  do  relations 
of  class  5)  . 

This  loss  of  structure  has  two  consequences: 

a)  Without  knowing  at  least  intuitively  to  which  type 
a  relation  belongs,  no  consistent  modifications, 
insertions  or  deletions  of  tuples  in  the  relation 
can  be  made.  This  is  because  the  consequences  of  a 
modification  on  other  relations  are  different 
according  to  their  types.  (Compare  rules  1  -  4.) 

b)  Questions  in  a  data  base  can  be  based  either  on  a 

commonality  of  underlying  domains,  or  on  an 

explicit  connection  between  domains.  In  the  latter 
case,  possible  connections  among  domains  of 

different  relations  must  be  recognized.  But  this 
may  be  rather  difficult  in  the  relational  model. 
Consider  the  relational  data  model  definition  in 
Figure  2,  and  suppose  that  the  domain  TOPIC  in 

the  relation  TRAINING  and  the  domain  JOBNAME  in 

the  relation  JOBDESCR  are  given.  How  are  they 

connected?  Either  these  attributes  could  be 

foreign  keys  themselves,  whence  we  must  look  for 
where  they  occur  in  other  relations,  or  a 

connection  could  go  through  the  primary  keys, 
whence  we  must  look  for  those  occurrences. 

Furthermore,  the  connections  could  go  through 


several  relations. 
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In  the 
data  base 
attributes 


basic  semantic  model  more  of  the  structure  of  the 
is  shown  explicitly  and  connections  between 
are  seen  more  easily. 


Based  on  the  semantic  model,  two  alternatives  are 
proposed  as  to  how  a  relational  data  model  definition  should 
be  specified.  These  proposals  maintain  the  advantages  but 
avoid  the  disadvantages  of  the  relational  model. 


In  the 

first 

approach , 

a  two  level 

network  is 

superimposed 

onto 

a  relational  schema. 

one  level 

representing 

the 

associat ions , 

the  other 

representing 

separately  each  complex  object.  From  both  levels,  there  are 
references  to  corresponding  relations  in  the  relational 
schema  and  vice  versa. 


I 

possi 

group 

relat 


n  this  approach,  the  network 
ble  connections  among  relations  and 
ing  of  attributes.  But  all 
ional  model  are  preserved. 


shows  clearly  the 
allows  hierarchical 
advantages  of  the 


In  the  second  approach,  the  structural  information 
provided  by  a  grouping  of  relations  in  declarations  w 
indications  of  their  semantic  type.  We  denote  this  in 
example  below  by  (#)  ,  Further  syntactic  means  sho 

permit  attributes  to  be  grouped  in  a  relation.  Let 
demonstrate  using  the  data  model  definition  of  Figure  2. 


is 

ith 

our 

uld 

us 
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COMPLEX  INDEPENDENT  QBJEC^ 


EMPL 


COST 


MAN 


WITH  KEY  EMPL# 

IN  ASSOCIATIONS  SALE (5),  MANAGED(5) 

niscif^D  BY  ""employee  (1) 

CHILDREN  (2) 

SAL_HISTORY  (2) 

JOB  DESCR(3) 

TRAINING (4) 

WITH  KEY  COST# 

IN  ASSOCIATIONS  SALE (5) 

described  by  “cust 

WITH  K^  MAN# 

IN  associations  managed (5) 

DESCPIBED^BY  MAN 


ASSOCIATIONS 


SALE(5) 

WITH 

KEY 

(EMPL#,  CUST#) 

MANAGED (5) 

WITH 

KEY 

EMPL# 

RELATIONS 

EMPLOYEE (1) 

WITH 

KEY 

EMPL# 

(EMPL#, 

NAME 


FIRST_NAME 

LAST_NAME 

ADDRESS 

TOWN 

STREET 

HOUSE# 

DATE_OF_BIRTH 

SAL 

JOBDESCR# 

COURSE#) 

CHILDREN (2)  WITH  KEY  EMPL#,  CHILD  NAME 

TeMPL# ,“CHILD_NAME) 

SAL_HISTORY(2)  WITH  KEY  EMPL#,  DATE 

(EMPL# 

SAL_HISTORY 

DATE 

SALARY) 
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DOMAINS 

EHPL#  IlilEGEE(IO) 

FIRST_NAME  &APACTEP(20) 

The  grouping  of  attributes  may  be  shown  in  other  ways, 
e.g,  by  using  nested  brackets  in  a  linear  notation. 
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6 .  Conclusions 

It  has  been  demonstrated  that  when  the  relational  model 
is  considered  purely  mathematically  then  some  questions  of  a 
semantic  nature  are  left  open.  In  order  to  resolve  these 
questions  a  basic  semantic  data  model  was  introduced. 
Pesults  on  the  structurinq  of  data  were  then  derived.  They 
apparently  corresponded  to  what  we  regard  as  "real  world" 
structure . 

?lpplying  these  results  to  the  relational  data  model,  the 
concepts  of  functional  dependency  and  of  normalization  were 
analyzed  from  a  semantic  point  of  view.  In  particular,  it 
was  shown  that  normalization  delivers  apart  from  one 
exception,  relations  such  that  each  of  them  represents  one 
of  the  concepts  developed  in  the  basic  semantic  data  model. 
This  yielded  five  different  types  of  relations  in  optimal 
third  normal  form.  Insertion-deletion  rules  derived  from 
the  basic  model  were  applied  to  these  types. 

Finally,  a  form  was  suggested  for  the  specification  of 
the  relational  data  model  which  incorporated  the  knowledge 
about  the  basic  semantic  structure  of  a  data  base. 
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8 .  Appendix : 

Here  we  present  a  proof  of  Theorem  1  of  section  4.2. 

a  result  of  our  earlier  assumptions,  we  may  use 
the  Boyce-Codd  definition  [5]  of  third  normal  form  in  our 
ptoof . 

a)  If  there  is  a  relation  from  one  of  these  types  then  it 
is  in  Boyce-Codd  optimal  third  normal  form.  It  can  easily 
be  seen  that  according  to  the  functional  dependencies 
derived  in  4.1: 

in  the  first  and  third  type  simple  characteristics  and 
characteristic  keys  are  functionally  dependent  only  on 
the  primary  and  characteristic  keys,  respectively. 

in  the  second,  and  fourth  types  there  are  no  functional 
dependencies  at  all. 

in  the  fifth  type  there  are  no  functional  dependencies 
or  there  are  functional  associations. 

b)  If  a  relation  is  in  Boyce-Codd  optimal  third  normal 
form,  then  it  is  in  one  of  these  types. 

Suppose  a  relation  contains  a  candidate  key.  Then  it 
contains  all  candidate  keys  because  of  the  optimality 
requirement  and  because  keys  are  not  transitively 
dependent  on  each  other.  It  must  contain  all  other  non¬ 
repeating  simple  characteristics  and  characteristic  keys 
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for  the  same  reasons.  It  must  not  contain 
characteristics  that  are  dependent  both  on  the  primary 
keys  and  a  characteristic  key  (because  of  exclusion  of 
strictly  transitive  dependence  [3]).  As  there  are  no 
further  functional  dependencies  on  the  primary  keys  this 
relation  is  in  Boyce-Codd  optimal  third  normal  form,  and 
of  the  first  type.  The  same  argument  applies  for 
relations  of  the  third  type. 

Suppose  a  relation  contains  the  candidate  key  and  one 
repeating  characteristic  or  characteristic  key.  As  a 
consequence,  it  is  of  type  2  or  4,  respectively. 

If  a  relation  is  not  covered  by  the  previous  two  cases, 
then  it  must  be  a  relation  containing  a  candidate  key 
and  either  representing  a  non-functional  association,  or 
representing  a  functional  association.  This  is 
precisely  type  5. 
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f  M.  Sc.  Thesis,  DCS  1972  ] 

CSRG-14  THE  OSE  OF  SERVICE  TIME  DISTRIBDTIONS  IN  SCHEDULING 
Kenneth  C.  Sevcik,  May  1972 

[Ph.D.  Thesis,  Committee  on  Information  Sciences, 
University  of  Chicago,  1971;  JACM,  January  1974] 


CSRG-15  PROCESS  STRUCTURING 

J. J.  Horning  and  B.  Randell,  June  1972 
[ACM  Computing  Surveys,  March  1973] 

CSEG-16  OPTIMAL  PROCESSOR  SCHEDULING  WHEN  SERVICE  TIMES  APE 

HYPEREXPONENTIALLY  DISTRIBUTED  AND  PREEMTION  OVERHEAD 

IS  NOT  NEGLIGIBLE 

Kenneth  C.  Sevcik,  June  1972 

[Proceedings  of  the  Symposium  on  Computer-Communication, 
Networks  and  Teletraffic, 

Polytechnic  Institute  of  Brooklyn,  1972] 

CSRG-17  PROGRAMMING  LANGUAGE  TRANSLATION  TECHNIQUES 
W.M.  McKeeman,  July  1972 

CSPG-18  A  COMPARATIVE  ANALYSIS  OF  SEVERAL  DISK  SCHEDULING 
ALGORITHMS 

C, J.M,  Turnbull,  September  1972 


CSRG-19  PROJECT  SUE  AS  A  LEARNING  EXPERIENCE 
K.C.  Sevcik  et  al,  September  1972 

[Proceedings  AFIPS  Fall  Joint  Computc‘r  Conference, 
V.  41,  December  1972] 

CSRG-20  A  STUDY  OF  LANGUAGF  DIRECTED  COMPUTER  DESIGN 
David  B.  Wortman,  December  1972 
[Ph.D,  Thesis,  Computer  Science  Department, 
Stanford  University,  1972] 

CSRG-21  AN  APL  TERMINAL  APPROACH  TO  COMPUTER  MAPPING 
R.  Kvat ernik ,  December  1972 
[M.Sc.  Thesis,  DCS  1972] 

CSRG-22  AN  IMPLEMENTATION  LANGUAGE  FOR  MINICOMPUTERS 
G.G,  Kalmar,  January  1973 
[M.Sc.  Thesis,  DCS  1972] 


CSPG-23  COMPILER  STRUCTURE 

W.M.  McKeeman,  January  1973 

[Proceedings  of  the  USA-Japan  Computer  Conference,  1972] 

CSRG-24  an  ANNOTATED  BIBLIOGRAPHY  ON  COMPUTER  PROGRAM 
ENGINEERING 

J.D.  Gannon  (ed.),  March  1973 


CSRG- 

-25  THE  INVESTIGATION  OF  SERVICE  TIHE  DISTRIBHTIONS 

Eleanor  A.  Lester,  April  1973 
[M.Sc.  Thesis,  DCS  1973  ] 

♦  CSRG- 

-26  PSYCHOLOGICAL  COMPLEXITY  OP  COMPUTER  PROGRAMS: 

AN  INITIAL  EXPERIMENT 

Larry  Weissman,  August  1973 

*  CSRG- 

-27  STFUCTUPFD  SUBSETS  OF  THF  PL/I  LANGUAGE 

Richard  C.  Holt  and  David  B.  Nortman,  October  1973 

♦  CSPG- 

•28  ON  THE  REDUCED  MATRIX  REPRESENTATION  OF  LR(k) 

PARSER  TABLES 

Marc  Louis  Joliat,  October  1973 
[ Ph.D.  Thesis,  EE  1973] 

♦  CSRG- 

-29  A  STUDENT  PROJECT  FOR  AN  OPERATING  SYSTEMS  COURSE 

B.  Czarnik  and  P.  Tsichritzis  (eds.)»  November  1973 

♦  CGPG- 

-30  A  PSEUDO-MACHINE  FOP  CODE  GENERATION 

Henry  John  Pasko,  December  1973 
tM.Sc.  Thesis,  DCS  1973] 

♦  CGPG- 

-31  AN  ANNOTATED  BIBLIOGRAPHY  ON  COMPUTER  PROGRAM 
ENGINEERING 

J.D.  Gannon  (cd.).  Second  Edition,  March  1974 

GGRG- 

•32  SCHEDULING  MULTIPLE  RESOURCE  COMPUTER  SYSTEMS 

E.D.  Lazowska,  May  1974 
[M.Sc.  Thesis,  DCS  19lU] 

=»■  CSRG- 

-33  AN  EDUCATIONAL  DA-^A  BASE  MANAGEMENT  SYSTEM 

E.  Lochovsky  and  D.  Tsichritzis,  May  1974 

’t'  CS^G- 

•34  ALLOCATING  STORAGE  IN  HIERARCHICAL  DATA  BASES 

P.  Bernstein  and  D.  Tsichritzis,  May  1974 

*  C?PG- 

•35  ON  IMPLEMENTATION  OF  RELATIONS 

D.  Tsichritzis,  May  1974 

*  CSRG- 

•36  SIX  PL/I  COMPILERS 

D.P.  Wortman,  P.J.  Khaiat,  and  D. M.  Lasker 

August  1974 

CGRG- 

•37  A  METHODOLOGY  FOP  STUDYING  THE  PSYCHOLOGICAL  COMPLEXITY 
OF  COMPUTER  PROGRAMS 

Laurence  M.  Weissman,  August  1974 
[Ph.D.  Thesis,  DCS  1974] 

CGRG- 

CGP  G- 

•38  AN  INVESTIGATION  OF  A  NEW  METHOD  OF  CONSTRUCTING 
SOFTWARE 

David  M.  Lasker,  September  19*^4 
[M.Sc.  Thesis,  DCS  1974] 

•39  AN  ALGEBRAIC  MODEL  FOR  STRING  PATTERNS 

Glenn  F.  Stewart,  September  1974 
[M.Sc.  Thesis,  DCS,  1974] 
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RDHCATION^^L  DATA  BASE  SYSTEM  USEE'S  MANUAL 
J.  Klebanoff,  F.  Lochovsky,  A.  Rozitis,  and 

D.  Tsichritzis,  Septeab^r  1974 

NOTES  FROM  A  WORKSHOP  ON  THE  ATTAINMENT  OF 
RELIABLE  SOFTWARE 

David  B.  Wortman  (ed.),  September  1974 

THE  PROJECT  SUE  SYSTEM  LANGUAGE  REFERENCE  MANUAL 
B.L.  Clark  and  F.J.B,  Ham,  September  1974 

A  DATA  BASE  PROCESSOR 

E. A.  Ozkarahan,  S.A.  Schuster  and  K.C.  Smith, 

November  1974 

MATCHING  PROGRAM  AND  DATA  REPRESENTATION  TO  A 

COMPUTING  ENVIRONMENT 

Eric  C. P.  Hehner,  November  1974 

[Ph.D.  Thesis,  DCS,  1974] 

THREE  APPROACHES  TO  RELIABLE  SOFTWARE;  LANGUAGE 
DESIGN,  DYADIC  SPECIFICATION,  COMPLEMENTARY  SEMANTICS 
J,E,  Donahue,  J,D.  Gannon,  J.V.  Guttag  and 
J.J.  Hornina,  Dec^^mber  1974 

THE  SYNTHESIS  OF  OPTIMAL  DECISION  TREES  FROM 
DECISION  TABLES 

Helmut  Schumacher,  Decemb'=r  1974 
PM. Sc.  Thesis,  DCS,  1974] 

LANGUAGE  DESIGN  TO  ENHANCE  PROGRAMMING  RELIABILITY 
John  D .  Gannon ,  Jan uar y  197  5 
[Ph.D.  Thesis,  DCS,  1975] 

DETERMINISTIC  LEFT  TO  RIGHT  PA.RSING 
Christopher  J.M.  Turnbull,  January  1975 
[Ph.D.  Thesis,  EE,  1974  ] 

A  NETWORK  FRAMEWORK  FOR  RELATIONAL  IMPLEMENTATION 
D.  Tsichritzis,  ’^’ebruary  1975 

A  UNIFIED  APPROACH  TO  FUNCTIONAL  DFFENDENCIES 
AND  RELATIONS 

p.A.  Bernstein,  J.R.  Swenson  and  D.C.  Tsichritzis 
February  1*575 

7, ETA:  A  PROTOTYPE  RELATIONAL  DATA  BASE 

MANAGEMENT  SYSTEM 

M.  Prodie  (ed) .  February  1975 

AUTOMATIC  GENERATION  OF  SYNTAX-REPATRING  AND 
PARAGRAPHING  PARSERS 
David  T.  Barnard,  March  1975 
[H.Sc.  Thesis,  DCS,  1975] 

QUERY  EXECUTION  AND  INDEX  SELECTION  FOR  RELATIONAL 
DATA  BASES 

J.H.  Gillss  Farley  and  St^^wart  A.  Schuster,  March  1975 


CSPG-54  AN  ANNOTATED  BIBLIOGRAPHY  ON  COMPUTER 
PROGRAM  ENGINEERING 

J.V.  Guttag  (ed.)»  Third  Edition,  April  1975 

CSPG-55  STRUCTURED  SUBSETS  OF  THE  PL/1  LANGUAGE 

Richard  C.  Holt  and  David  B.  Wortman,  May  1975 

CSRG-56  FEATURES  OP  A  CONCEPTUAL  SCHEMA 
D.  Tsichrit2is,  June  1975 

CSRG-57  MERLIN:  TOWARDS  AN  IDEAL  PROGRAMMING  LANGUAGE 
Eric  C.R,  Hehner,  July  1975 

CSPG-58  ON  '^HE  SEMANTICS  OF  THE  RELATIONAL  DATA  MODEL 
Hans  Albert  Schmid  and  J.  Richard  Swenson, 

July  1975  r Proceedings  of  the  ACM  SIGMOD 
Conference,  1975] 
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